The advancedraincloud module draws raincloud plots with the ggrain
package and adds features that standard raincloud plots lack.
It is designed for research scenarios that need longitudinal connections
between repeated measures, Likert scale support, and flexible
positioning of the raincloud.
| Feature | Standard Raincloud (ggdist) | Advanced Raincloud (ggrain) |
|---|---|---|
| Package | ggdist | ggrain |
| Longitudinal Connections | ❌ | ✅ Connect repeated measures |
| Likert Scale Support | ❌ | ✅ Y-axis jittering for ordinal data |
| Positioning Options | Fixed | ✅ Left/Right/Flanking |
| Covariate Mapping | ❌ | ✅ Point color remapping |
| Clinical Applications | Basic distribution | ✅ Complex research designs |
We’ll demonstrate using the histopathology dataset:
# Load the histopathology dataset
data("histopathology")
# Dataset structure for raincloud plots
str(histopathology[c("Group", "Age", "OverallTime", "Sex", "Grade_Level", "ID")])
#> tibble [250 × 6] (S3: tbl_df/tbl/data.frame)
#> $ Group : chr [1:250] "Control" "Treatment" "Control" "Treatment" ...
#> $ Age : num [1:250] 27 36 65 51 58 53 33 26 25 68 ...
#> $ OverallTime: num [1:250] 3.5 3.1 3.1 4.9 3.3 9.3 6.3 9 5.8 9.9 ...
#> $ Sex : chr [1:250] "Male" "Female" "Male" "Male" ...
#> $ Grade_Level: chr [1:250] "high" "low" "low" "high" ...
#> $ ID : num [1:250] 1 2 3 4 5 6 7 8 9 10 ...
# Key variables for advanced raincloud plots
cat("Variables suitable for advanced raincloud visualization:\n")
#> Variables suitable for advanced raincloud visualization:
cat("Y-axis (continuous):", paste(c("Age", "OverallTime", "MeasurementA", "MeasurementB"), collapse = ", "), "\n")
#> Y-axis (continuous): Age, OverallTime, MeasurementA, MeasurementB
cat("X-axis (grouping):", paste(c("Group", "Sex", "Grade_Level"), collapse = ", "), "\n")
#> X-axis (grouping): Group, Sex, Grade_Level
cat("Fill variable:", paste(c("Sex", "Grade_Level"), collapse = ", "), "\n")
#> Fill variable: Sex, Grade_Level
cat("ID variable:", "ID (for longitudinal connections)\n")
#> ID variable: ID (for longitudinal connections)
# Check for repeated measures (simulated)
set.seed(123)
histopathology$ID <- rep(1:125, 2) # Simulate paired data
histopathology$Time <- rep(c("Baseline", "Follow-up"), each = 125)
cat("Simulated longitudinal data: 125 patients with baseline and follow-up measurements\n")
#> Simulated longitudinal data: 125 patients with baseline and follow-up measurements
Let’s start with a basic advanced raincloud plot:
# Basic advanced raincloud plot
advancedraincloud(
data = histopathology,
y_var = "Age",
x_var = "Group",
plot_title = "Age Distribution by Treatment Group",
show_statistics = TRUE,
show_interpretation = TRUE
)
# Reproducible R code using ggrain
library(ggrain)
library(ggplot2)
library(dplyr) # provides %>% and filter() used below
# Filter out missing values
histopathology_clean <- histopathology %>%
filter(!is.na(Group) & !is.na(Age))
# Basic advanced raincloud with ggrain
basic_plot <- ggplot(histopathology_clean, aes(x = Group, y = Age, fill = Group)) +
geom_rain(rain.side = "l") +
scale_fill_manual(values = c("#2E86AB", "#A23B72")) +
theme_minimal() +
labs(
title = "Advanced Raincloud Plot - Basic Example",
x = "Treatment Group",
y = "Age (years)"
) +
theme(legend.position = "none")
print(basic_plot)
The most powerful feature of advanced raincloud plots is connecting repeated observations from the same subject.
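Connecting lines are only meaningful when each subject ID appears exactly once per time point. Before plotting, it can help to check that. The following is a minimal base R sketch on a small hypothetical data frame (`long_df` and its columns are illustrative names, not part of the module):

```r
# Hypothetical long-format data: 4 subjects, 2 time points each
long_df <- data.frame(
  ID    = rep(1:4, times = 2),
  Time  = rep(c("Baseline", "Follow-up"), each = 4),
  Value = c(3.5, 3.1, 4.9, 3.3, 5.2, 5.0, 7.1, 4.8)
)

# Cross-tabulate: one row per subject, one column per time point
counts <- table(long_df$ID, long_df$Time)

# TRUE only if every subject has exactly one value at every time point
is_fully_paired <- all(counts == 1)

# IDs whose lines would be missing or ambiguous (empty when fully paired)
problem_ids <- rownames(counts)[rowSums(counts != 1) > 0]
```

The simulated `ID` and `Time` columns created earlier with `rep(1:125, 2)` and `rep(..., each = 125)` follow the same pattern, so the same check applies to them.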
# Advanced raincloud with longitudinal connections
advancedraincloud(
data = histopathology,
y_var = "OverallTime",
x_var = "Time",
id_var = "ID",
show_longitudinal = TRUE,
plot_title = "Overall Survival Time: Baseline vs Follow-up",
x_label = "Time Point",
y_label = "Overall Time (months)"
)
# Create simulated longitudinal data
longitudinal_data <- histopathology %>%
filter(!is.na(OverallTime)) %>%
slice_head(n = 50) %>% # Smaller sample for clearer visualization
select(-Time) %>% # Remove existing Time column to avoid duplicates
mutate(
Baseline = OverallTime,
Follow_up = OverallTime + rnorm(n(), 2, 1), # Simulated follow-up
ID = row_number()
) %>%
tidyr::pivot_longer(
cols = c(Baseline, Follow_up),
names_to = "Time",
values_to = "Measurement"
)
# Advanced raincloud with longitudinal connections
longitudinal_plot <- ggplot(longitudinal_data, aes(x = Time, y = Measurement, fill = Time)) +
geom_rain(
id.long.var = "ID", # Connect observations by ID
rain.side = "f" # Flanking rainclouds
) +
scale_fill_manual(values = c("#1f77b4", "#ff7f0e")) +
theme_minimal() +
labs(
title = "Advanced Raincloud with Longitudinal Connections",
x = "Time Point",
y = "Measurement Value",
caption = "Lines connect repeated observations from the same subjects"
)
#> Warning: Option rain.side 'flanking' is being used with a side argument in violin.args.pos!!!
#>
#>
#> If you want the nudging position defaults for a flanking 1-by-1 raincloud use (rain.side = 'f1x1')
#>
#> If you want the nudging position defaults for a flanking 2-by-2 raincloud use (rain.side = 'f2x2')
#>
#>
#> Now defaulting to a 2-by-2
print(longitudinal_plot)
#> Warning in compute_layer(..., self = self): Argument 'x' longer than data: some
#> values dropped!
#> Warning: Using the `size` aesthetic with geom_polygon was deprecated in ggplot2 3.4.0.
#> ℹ Please use the `linewidth` aesthetic instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
Advanced raincloud plots offer flexible positioning: the raincloud can sit on the left, on the right, or flank both sides.
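The warning printed in the longitudinal example suggests `rain.side = 'f1x1'` when a flanking plot has one group per time point. As a sketch of that suggestion, assuming ggrain and ggplot2 are installed, the code below builds such a plot on hypothetical paired data (`paired_df` and its columns are made up for illustration):

```r
library(ggplot2)
library(ggrain)

set.seed(42)
# Hypothetical paired data: 20 subjects measured at two time points
paired_df <- data.frame(
  id    = rep(1:20, times = 2),
  time  = rep(c("t1", "t2"), each = 20),
  score = c(rnorm(20, mean = 10, sd = 2), rnorm(20, mean = 12, sd = 2))
)

# "f1x1" requests the 1-by-1 flanking defaults named in the ggrain warning;
# id.long.var draws a line between the two observations of each subject
p_f1x1 <- ggplot(paired_df, aes(x = time, y = score, fill = time)) +
  geom_rain(rain.side = "f1x1", id.long.var = "id") +
  theme_minimal()
```

Calling `print(p_f1x1)` renders the plot. With a second grouping variable mapped to `fill`, the warning points to `'f2x2'` instead.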
# Left-side raincloud (default)
advancedraincloud(
data = histopathology,
y_var = "Age",
x_var = "Sex",
rain_side = "l",
plot_title = "Left-side Raincloud"
)
# Right-side raincloud
advancedraincloud(
data = histopathology,
y_var = "Age",
x_var = "Sex",
rain_side = "r",
plot_title = "Right-side Raincloud"
)
# Flanking rainclouds (both sides)
advancedraincloud(
data = histopathology,
y_var = "Age",
x_var = "Sex",
rain_side = "f",
plot_title = "Flanking Rainclouds"
)
# Create comparison plots
library(gridExtra)
#>
#> Attaching package: 'gridExtra'
#> The following object is masked from 'package:dplyr':
#>
#> combine
# Left-side raincloud
histopathology_clean <- histopathology %>%
filter(!is.na(Sex) & !is.na(Age))
left_plot <- ggplot(histopathology_clean, aes(x = Sex, y = Age, fill = Sex)) +
geom_rain(rain.side = "l") +
scale_fill_manual(values = c("#e74c3c", "#3498db")) +
theme_minimal() +
labs(title = "Left-side Raincloud", x = "Sex", y = "Age") +
theme(legend.position = "none")
# Right-side raincloud
right_plot <- ggplot(histopathology_clean, aes(x = Sex, y = Age, fill = Sex)) +
geom_rain(rain.side = "r") +
scale_fill_manual(values = c("#e74c3c", "#3498db")) +
theme_minimal() +
labs(title = "Right-side Raincloud", x = "Sex", y = "Age") +
theme(legend.position = "none")
# Flanking rainclouds
flanking_plot <- ggplot(histopathology_clean, aes(x = Sex, y = Age, fill = Sex)) +
geom_rain(rain.side = "f") +
scale_fill_manual(values = c("#e74c3c", "#3498db")) +
theme_minimal() +
labs(title = "Flanking Rainclouds", x = "Sex", y = "Age") +
theme(legend.position = "none")
#> Warning: Option rain.side 'flanking' is being used with a side argument in violin.args.pos!!!
#>
#>
#> If you want the nudging position defaults for a flanking 1-by-1 raincloud use (rain.side = 'f1x1')
#>
#> If you want the nudging position defaults for a flanking 2-by-2 raincloud use (rain.side = 'f2x2')
#>
#>
#> Now defaulting to a 2-by-2
# Display plots
grid.arrange(left_plot, right_plot, flanking_plot, ncol = 3)
#> Warning in compute_layer(..., self = self): Argument 'x' longer than data: some
#> values dropped!
For ordinal survey data, Likert mode jitters points along the y-axis so that discrete responses do not stack on top of each other.
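To see why a small amount of vertical jitter is safe for 1 to 5 scores, here is a base R sketch. It assumes uniform noise within ±0.1, which matches the amount reported in ggrain's message; how ggrain applies the jitter internally may differ.

```r
set.seed(1)
# Hypothetical 1-5 Likert responses
scores <- sample(1:5, size = 200, replace = TRUE)

# Add uniform noise within +/- 0.1 so identical scores no longer overlap
jittered <- scores + runif(length(scores), min = -0.1, max = 0.1)

# The noise is small enough that rounding recovers the original category
recovered <- round(jittered)
```

Because the jitter stays well below 0.5, each point remains visually attached to its original category on the y-axis.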
# Simulate Likert scale data
histopathology$Satisfaction <- sample(1:5, nrow(histopathology), replace = TRUE,
prob = c(0.1, 0.2, 0.4, 0.2, 0.1))
# Advanced raincloud for Likert data
advancedraincloud(
data = histopathology,
y_var = "Satisfaction",
x_var = "Group",
likert_mode = TRUE, # Enable Likert mode
plot_title = "Patient Satisfaction by Treatment Group",
y_label = "Satisfaction Score (1-5)"
)
# Create Likert scale data
likert_data <- histopathology %>%
mutate(
Satisfaction = sample(1:5, n(), replace = TRUE, prob = c(0.1, 0.2, 0.4, 0.2, 0.1)),
Satisfaction_Factor = factor(Satisfaction, levels = 1:5,
labels = c("Very Poor", "Poor", "Average", "Good", "Excellent"))
)
# Advanced raincloud for Likert data with jittering
likert_plot <- ggplot(likert_data, aes(x = Group, y = Satisfaction, fill = Group)) +
geom_rain(
likert = TRUE, # Enable Likert mode for Y-axis jittering
rain.side = "l"
) +
scale_fill_manual(values = c("#2E86AB", "#A23B72")) +
scale_y_continuous(breaks = 1:5, labels = c("Very Poor", "Poor", "Average", "Good", "Excellent")) +
theme_minimal() +
labs(
title = "Patient Satisfaction by Treatment Group - Likert Scale",
x = "Treatment Group",
y = "Satisfaction Score",
caption = "Y-axis jittering applied for discrete ordinal responses"
) +
theme(legend.position = "none")
#> Likert = T; setting y axis jittering for point & line to .1
print(likert_plot)
#> Warning: Groups with fewer than two datapoints have been dropped.
#> ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.
Advanced raincloud plots support point color remapping based on additional variables:
# Advanced raincloud with covariate mapping
advancedraincloud(
data = histopathology,
y_var = "OverallTime",
x_var = "Group",
cov_var = "Age", # Map point colors to age
plot_title = "Overall Time by Group with Age Mapping",
show_statistics = TRUE
)
# Advanced raincloud with covariate mapping
histopathology_cov <- histopathology %>%
filter(!is.na(Group) & !is.na(OverallTime) & !is.na(Age))
covariate_plot <- ggplot(histopathology_cov, aes(x = Group, y = OverallTime, fill = Group)) +
geom_rain(
rain.side = "l",
cov = "Age" # Map point colors to age covariate
) +
scale_fill_manual(values = c("#2E86AB", "#A23B72")) +
scale_color_viridis_c(name = "Age") + # Continuous color scale for age
theme_minimal() +
labs(
title = "Overall Time by Group with Age Covariate Mapping",
x = "Treatment Group",
y = "Overall Time (months)",
caption = "Point colors represent patient age"
)
print(covariate_plot)
The module includes six color palettes; three of them are shown below:
# Clinical palette (default)
advancedraincloud(data = histopathology, y_var = "Age", x_var = "Grade_Level",
color_palette = "clinical")
# Viridis palette
advancedraincloud(data = histopathology, y_var = "Age", x_var = "Grade_Level",
color_palette = "viridis")
# Pastel palette
advancedraincloud(data = histopathology, y_var = "Age", x_var = "Grade_Level",
color_palette = "pastel")
# Create palette comparison
grade_data <- histopathology %>% filter(!is.na(Grade_Level))
# Clinical palette
clinical_colors <- c("#2E86AB", "#A23B72", "#F18F01")
clinical_plot <- ggplot(grade_data, aes(x = Grade_Level, y = Age, fill = Grade_Level)) +
geom_rain() +
scale_fill_manual(values = clinical_colors) +
theme_minimal() +
labs(title = "Clinical Palette") +
theme(legend.position = "none")
# Viridis palette
viridis_plot <- ggplot(grade_data, aes(x = Grade_Level, y = Age, fill = Grade_Level)) +
geom_rain() +
scale_fill_viridis_d() +
theme_minimal() +
labs(title = "Viridis Palette") +
theme(legend.position = "none")
# Pastel palette
pastel_colors <- c("#FFB3BA", "#BAFFC9", "#BAE1FF")
pastel_plot <- ggplot(grade_data, aes(x = Grade_Level, y = Age, fill = Grade_Level)) +
geom_rain() +
scale_fill_manual(values = pastel_colors) +
theme_minimal() +
labs(title = "Pastel Palette") +
theme(legend.position = "none")
# Display palette comparison
grid.arrange(clinical_plot, viridis_plot, pastel_plot, ncol = 3)
#> Warning: Removed 1 row containing non-finite outside the scale range
#> (`stat_half_ydensity()`).
#> Warning: Removed 1 row containing non-finite outside the scale range
#> (`stat_boxplot()`).
#> Warning: Removed 1 row containing missing values or values outside the scale range
#> (`geom_point_sorted()`).
#> Warning: Removed 1 row containing non-finite outside the scale range
#> (`stat_half_ydensity()`).
#> Warning: Removed 1 row containing non-finite outside the scale range
#> (`stat_boxplot()`).
#> Warning: Removed 1 row containing missing values or values outside the scale range
#> (`geom_point_sorted()`).
#> Warning: Removed 1 row containing non-finite outside the scale range
#> (`stat_half_ydensity()`).
#> Warning: Removed 1 row containing non-finite outside the scale range
#> (`stat_boxplot()`).
#> Warning: Removed 1 row containing missing values or values outside the scale range
#> (`geom_point_sorted()`).
Advanced raincloud plots can report summary statistics and group comparisons alongside the plot:
# With summary statistics and group comparisons
advancedraincloud(
data = histopathology,
y_var = "Age",
x_var = "Group",
show_statistics = TRUE, # Summary statistics table
show_comparisons = TRUE, # Statistical tests
show_interpretation = TRUE # Feature guide
)
The module automatically selects appropriate tests.
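The selection rule itself is not shown in this document. As an illustration only, one common convention is a Wilcoxon rank-sum test for two groups and a Kruskal-Wallis test for three or more. The helper `choose_group_test()` below is hypothetical and is not the module's implementation:

```r
# Hypothetical helper; NOT the module's actual selection logic
choose_group_test <- function(y, g) {
  g <- factor(g)
  keep <- !is.na(y) & !is.na(g)      # drop incomplete pairs
  y <- y[keep]
  g <- droplevels(g[keep])           # ignore groups left empty
  if (nlevels(g) < 2) stop("need at least two groups")
  if (nlevels(g) == 2) {
    list(name = "Wilcoxon rank-sum", result = wilcox.test(y ~ g))
  } else {
    list(name = "Kruskal-Wallis", result = kruskal.test(y ~ g))
  }
}

set.seed(123)
# Simulated data with two and three groups
two   <- choose_group_test(rnorm(40), rep(c("A", "B"), each = 20))
three <- choose_group_test(rnorm(60), rep(c("A", "B", "C"), each = 20))
```

`two$name` and `three$name` report which test was chosen, and the `result` element holds the usual `htest` object with its `p.value`.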
# Demonstrate statistical testing
group_data <- histopathology %>% filter(Group %in% c("Control", "Treatment"))
# Wilcoxon test for two groups
wilcox_result <- wilcox.test(Age ~ Group, data = group_data)
cat("Statistical Test Results:\n")
#> Statistical Test Results:
cat("Wilcoxon rank-sum test\n")
#> Wilcoxon rank-sum test
cat("W =", wilcox_result$statistic, "\n")
#> W = 7955.5
cat("p-value =", format.pval(wilcox_result$p.value, digits = 3), "\n")
#> p-value = 0.626
# Summary statistics by group
summary_stats <- group_data %>%
group_by(Group) %>%
summarise(
n = n(),
mean = round(mean(Age, na.rm = TRUE), 1),
median = round(median(Age, na.rm = TRUE), 1),
sd = round(sd(Age, na.rm = TRUE), 1),
.groups = 'drop'
)
print(summary_stats)
#> # A tibble: 2 × 5
#> Group n mean median sd
#> <chr> <int> <dbl> <dbl> <dbl>
#> 1 Control 120 49.8 49.5 14.4
#> 2 Treatment 129 49 49 13.3
Point size, transparency, and boxplot width can all be customized:
# Highly customized advanced raincloud
advancedraincloud(
data = histopathology,
y_var = "OverallTime",
x_var = "Group",
fill_var = "Sex",
point_size = 2.0, # Larger points
point_alpha = 0.8, # Point transparency
violin_alpha = 0.6, # Violin transparency
boxplot_width = 0.15, # Boxplot width
plot_title = "Customized Advanced Raincloud Plot",
x_label = "Treatment Group",
y_label = "Overall Survival Time (months)"
)
# Load required libraries
library(ggrain)
library(ggplot2)
# Filter out any missing values to prevent errors
histopathology_clean <- histopathology %>%
filter(!is.na(Group) & !is.na(OverallTime) & !is.na(Sex))
# Highly customized plot
custom_plot <- ggplot(histopathology_clean, aes(x = Group, y = OverallTime, fill = Sex)) +
geom_rain(
rain.side = "l", # Raincloud on the left side
point.args = list(size = 2.0, alpha = 0.8), # Custom points
violin.args = list(alpha = 0.6), # Custom violin
boxplot.args = list(width = 0.15) # Custom boxplot
) +
scale_fill_manual(values = c("#e74c3c", "#3498db")) +
theme_minimal() +
theme(
plot.title = element_text(size = 14, face = "bold", hjust = 0.5),
axis.title = element_text(size = 12),
legend.position = "bottom"
) +
labs(
title = "Customized Advanced Raincloud Plot",
x = "Treatment Group",
y = "Overall Survival Time (months)",
fill = "Sex"
)
#> Warning: Duplicated aesthetics after name standardisation: width
print(custom_plot)
# Simulate before/after treatment data
treatment_data <- data.frame(
Patient_ID = rep(1:30, 2),
Time = factor(rep(c("Before", "After"), each = 30), levels = c("Before", "After")), # keep chronological order on the x-axis
Biomarker = c(
rnorm(30, 100, 15), # Before treatment
rnorm(30, 85, 12) # After treatment (improvement)
),
Treatment = rep(c("Drug A", "Drug B"), each = 15, times = 2)
)
# Advanced raincloud with connections
before_after_plot <- ggplot(treatment_data, aes(x = Time, y = Biomarker, fill = Treatment)) +
geom_rain(
id.long.var = "Patient_ID", # Connect before/after for same patients
rain.side = "f" # Flanking for comparison
) +
scale_fill_manual(values = c("#2E86AB", "#A23B72")) +
theme_minimal() +
labs(
title = "Before/After Treatment Analysis with Patient Connections",
x = "Time Point",
y = "Biomarker Level",
fill = "Treatment",
caption = "Lines connect measurements from the same patients"
)
#> Warning: Option rain.side 'flanking' is being used with a side argument in violin.args.pos!!!
#>
#>
#> If you want the nudging position defaults for a flanking 1-by-1 raincloud use (rain.side = 'f1x1')
#>
#> If you want the nudging position defaults for a flanking 2-by-2 raincloud use (rain.side = 'f2x2')
#>
#>
#> Now defaulting to a 2-by-2
print(before_after_plot)
# Simulate clinical trial data
trial_data <- histopathology %>%
mutate(
Treatment_Arm = case_when(
Group == "Control" ~ "Placebo",
Group == "Treatment" ~ sample(c("Low Dose", "High Dose"), n(), replace = TRUE)
),
Response_Score = case_when(
Treatment_Arm == "Placebo" ~ rnorm(n(), 3, 1),
Treatment_Arm == "Low Dose" ~ rnorm(n(), 5, 1.2),
Treatment_Arm == "High Dose" ~ rnorm(n(), 7, 1.5)
)
) %>%
filter(!is.na(Treatment_Arm))
# Multi-group comparison
trial_plot <- ggplot(trial_data, aes(x = Treatment_Arm, y = Response_Score, fill = Treatment_Arm)) +
geom_rain(rain.side = "l") +
scale_fill_manual(values = c("Placebo" = "#95a5a6", "Low Dose" = "#3498db", "High Dose" = "#e74c3c")) + # named so colors do not depend on alphabetical level order
theme_minimal() +
labs(
title = "Clinical Trial: Treatment Response by Dose",
x = "Treatment Arm",
y = "Response Score",
caption = "Advanced raincloud plot showing distribution differences"
) +
theme(legend.position = "none")
print(trial_plot)
# Always check your data structure
cat("Data preparation checklist:\n")
#> Data preparation checklist:
cat("✓ Y-variable: continuous/numeric\n")
#> ✓ Y-variable: continuous/numeric
cat("✓ X-variable: categorical/factor\n")
#> ✓ X-variable: categorical/factor
cat("✓ ID variable: unique identifier for each subject\n")
#> ✓ ID variable: unique identifier for each subject
cat("✓ No excessive missing values\n")
#> ✓ No excessive missing values
# Example data check
data_check <- histopathology %>%
summarise(
n_total = n(),
n_complete = sum(complete.cases(Group, Age)),
missing_pct = round((n_total - n_complete) / n_total * 100, 1)
)
cat("\nData completeness:", data_check$n_complete, "of", data_check$n_total,
"cases (", data_check$missing_pct, "% missing)\n")
#>
#> Data completeness: 248 of 250 cases ( 0.8 % missing)
cat("Advanced Raincloud Plot Interpretation:\n")
#> Advanced Raincloud Plot Interpretation:
cat("📊 Violin: Shows probability density distribution\n")
#> 📊 Violin: Shows probability density distribution
cat("📦 Box: Median, quartiles, and outliers\n")
#> 📦 Box: Median, quartiles, and outliers
cat("🔵 Points: Individual observations\n")
#> 🔵 Points: Individual observations
cat("📈 Connections: Longitudinal relationships (when enabled)\n")
#> 📈 Connections: Longitudinal relationships (when enabled)
cat("🎨 Colors: Group or covariate distinctions\n")
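#> 🎨 Colors: Group or covariate distinctions
The advanced raincloud module requires the ggrain package, which builds on ggplot2; the reproducible examples above also use dplyr, tidyr, and gridExtra.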
Advanced raincloud plots extend standard raincloud visualization with longitudinal connections, Likert scale support, flexible positioning, and covariate mapping.
For additional statistical plotting modules in ClinicoPath:
- raincloud for standard raincloud plots
- jjscatterstats for correlation visualization
- jjbarstats for categorical data analysis
- jjhistostats for distribution analysis